
    Region-based Appearance and Flow Characteristics for Anomaly Detection in Infrared Surveillance Imagery

    Anomaly detection is a classical problem within automated visual surveillance, namely the determination of the normal from the abnormal when operational data availability is highly biased towards one class (normal) due to both insufficient sample size and inadequate distribution coverage for the other class (abnormal). In this work, we propose the dual use of both visual appearance and localized motion characteristics, derived from optic flow, applied on a per-region basis to facilitate object-wise anomaly detection within this context. Leveraging established object localization techniques from a region proposal network, optic flow is extracted from each object region and combined with appearance in the far infrared (thermal) band to give a 3-channel spatiotemporal tensor representation for each object (1 × thermal - spatial appearance; 2 × optic flow magnitude as x and y components - temporal motion). This formulation is used as the basis for training contemporary semi-supervised anomaly detection approaches in a region-based manner such that anomalous objects can be detected as a combination of appearance and/or motion within the scene. Evaluation is performed using the LongTerm infrared (thermal) Imaging (LTD) benchmark dataset, against which successful detection of both anomalous object appearance and motion characteristics is demonstrated using a range of semi-supervised anomaly detection approaches.
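
    A minimal sketch of how the 3-channel per-region tensor described above could be assembled, assuming bounding boxes (x, y, w, h) from an external region proposal network and 8-bit single-channel thermal frames; Farneback flow stands in here for whichever optic flow method is actually used.

        import cv2
        import numpy as np

        def region_tensors(prev_thermal, curr_thermal, boxes, size=64):
            """Return one (size, size, 3) tensor per region box:
            channel 0 = thermal appearance, channels 1-2 = flow x / y."""
            # dense optic flow between consecutive thermal frames
            # (args: pyr_scale, levels, winsize, iters, poly_n, poly_sigma, flags)
            flow = cv2.calcOpticalFlowFarneback(
                prev_thermal, curr_thermal, None, 0.5, 3, 15, 3, 5, 1.2, 0)
            tensors = []
            for (x, y, w, h) in boxes:
                patch = np.dstack([
                    curr_thermal[y:y+h, x:x+w].astype(np.float32) / 255.0,
                    flow[y:y+h, x:x+w, 0],   # horizontal motion component
                    flow[y:y+h, x:x+w, 1],   # vertical motion component
                ])
                # resize each object region to a fixed spatial size
                tensors.append(cv2.resize(patch, (size, size)))
            return tensors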

    Extended patch prioritization for depth filling within constrained exemplar-based RGB-D image completion.

    We address the problem of hole filling in depth images, obtained from either active or stereo sensing, for the purposes of depth image completion in an exemplar-based framework. Most existing exemplar-based inpainting techniques, designed for color image completion, do not perform well on depth information with object boundaries obstructed or surrounded by missing regions. In the proposed method, using both color (RGB) and depth (D) information available from a commonplace RGB-D image, we explicitly modify the patch prioritization term utilized for target patch ordering to facilitate improved propagation of complex texture and linear structures within depth completion. Furthermore, the query space in the source region is constrained to increase the efficiency of the approach compared to other exemplar-driven methods. Evaluations demonstrate the efficacy of the proposed method compared to other contemporary completion techniques.
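
    For context, an illustrative numpy sketch of patch prioritization in the spirit of Criminisi-style exemplar inpainting, which the above modifies; the plain confidence-times-data product below is the classical form, not the modified RGB-D term of this work.

        import numpy as np

        def confidence_term(confidence, known, p, half=4):
            """Mean confidence of already-known pixels in the patch at p."""
            y, x = p
            c = confidence[y-half:y+half+1, x-half:x+half+1]
            m = known[y-half:y+half+1, x-half:x+half+1]  # True where depth valid
            return c[m].sum() / c.size

        def next_target_patch(confidence, known, fill_front, data_term):
            """Pick the fill-front pixel with highest priority C(p) * D(p)."""
            priorities = [confidence_term(confidence, known, p) * data_term[p]
                          for p in fill_front]
            return fill_front[int(np.argmax(priorities))]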

    Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer

    Monocular depth estimation using learning-based approaches has become promising in recent years. However, most monocular depth estimators either need to rely on large quantities of ground truth depth data, which is extremely expensive and difficult to obtain, or predict disparity as an intermediary step using a secondary supervisory signal, leading to blurring and other artefacts. Training a depth estimation model using pixel-perfect synthetic data can resolve most of these issues but introduces the problem of domain bias: the inability to apply a model trained on synthetic data to real-world scenarios. With advances in image style transfer and its connections with domain adaptation (Maximum Mean Discrepancy), we take advantage of style transfer and adversarial training to predict pixel-perfect depth from a single real-world color image based on training over a large corpus of synthetic environment data. Experimental results indicate the efficacy of our approach compared to contemporary state-of-the-art techniques.
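
    A minimal worked example of the Maximum Mean Discrepancy (MMD) statistic referenced above, which links style transfer to domain adaptation; the RBF kernel and bandwidth choice are illustrative assumptions, not the paper's setup.

        import torch

        def rbf_mmd2(x, y, sigma=1.0):
            """Biased MMD^2 between feature batches x (n, d) and y (m, d)."""
            def kernel(a, b):
                d2 = torch.cdist(a, b) ** 2        # pairwise squared distances
                return torch.exp(-d2 / (2 * sigma ** 2))
            return (kernel(x, x).mean() + kernel(y, y).mean()
                    - 2 * kernel(x, y).mean())

        # e.g. encoder features of synthetic- vs real-domain images:
        syn = torch.randn(32, 128)
        real = torch.randn(32, 128) + 0.5
        print(rbf_mmd2(syn, real))   # larger value -> larger domain gap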

    DepthComp: Real-time Depth Image Completion Based on Prior Semantic Scene Segmentation

    We address plausible hole filling in depth images in a computationally lightweight methodology that leverages recent advances in semantic scene segmentation. Firstly, we perform such segmentation over a co-registered color image, commonly available from stereo depth sources, and non-parametrically fill missing depth values on a multi-pass basis within each semantically labeled scene object. Within this formulation, we identify a bounded set of explicit completion cases in a grammar-inspired context that can be performed effectively and efficiently to provide highly plausible localized depth continuity via a case-specific non-parametric completion approach. Results demonstrate that this approach has complexity and efficiency comparable to conventional interpolation techniques but with accuracy analogous to contemporary depth filling approaches. Furthermore, we show it to be capable of fine depth relief completion beyond that of both contemporary approaches in the field and computationally comparable interpolation strategies.
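
    A heavily simplified sketch of the core idea above; a single scanline propagation case stands in for the full set of grammar-inspired completion cases, and all names are illustrative.

        import numpy as np

        def fill_segment_rows(depth, segment_mask, hole_mask):
            """Propagate the nearest valid depth along each row, restricted
            to one semantic segment, so fills never cross object boundaries."""
            out = depth.copy()
            for y in range(depth.shape[0]):
                row_seg = segment_mask[y]
                valid_x = np.where(row_seg & ~hole_mask[y])[0]
                if valid_x.size == 0:
                    continue   # no valid depth in this segment on this row
                for x in np.where(row_seg & hole_mask[y])[0]:
                    nearest = valid_x[np.argmin(np.abs(valid_x - x))]
                    out[y, x] = depth[y, nearest]
            return out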

    Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach

    Robust geometric and semantic scene understanding is ever more important in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based approach capable of jointly performing geometric and semantic scene understanding, namely depth prediction (monocular depth estimation and depth completion) and semantic scene segmentation. Within a single temporally constrained recurrent network, our approach uniquely takes advantage of a complex series of skip connections, adversarial training and the temporal constraint of sequential frame recurrence to produce consistent depth and semantic class labels simultaneously. Extensive experimental evaluation demonstrates the efficacy of our approach compared to other contemporary state-of-the-art techniques. (CVPR 2019)
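
    A schematic torch sketch of the joint formulation above: a shared encoder, a simple convolutional recurrent state carried across frames, and one head per task. Layer sizes and the recurrence are illustrative assumptions, not the paper's architecture.

        import torch
        import torch.nn as nn

        class JointDepthSeg(nn.Module):
            def __init__(self, n_classes=19, feat=32):
                super().__init__()
                self.encoder = nn.Sequential(
                    nn.Conv2d(3, feat, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU())
                self.recur = nn.Conv2d(2 * feat, feat, 3, padding=1)  # fuse h
                self.depth_head = nn.Conv2d(feat, 1, 3, padding=1)
                self.seg_head = nn.Conv2d(feat, n_classes, 3, padding=1)

            def forward(self, frame, h_prev=None):
                f = self.encoder(frame)
                if h_prev is None:
                    h_prev = torch.zeros_like(f)
                # recurrent state links predictions across sequential frames
                h = torch.tanh(self.recur(torch.cat([f, h_prev], dim=1)))
                return self.depth_head(h), self.seg_head(h), h

        # per-frame usage keeps depth and labels temporally linked:
        model, h = JointDepthSeg(), None
        for frame in torch.randn(4, 1, 3, 64, 64):   # 4-frame toy sequence
            depth, seg, h = model(frame, h)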

    Monocular Segment-Wise Depth: Monocular Depth Estimation Based on a Semantic Segmentation Prior

    Monocular depth estimation using novel learning-based approaches has recently emerged as a promising potential alternative to more conventional 3D scene capture technologies within real-world scenarios. Many such solutions often depend on large quantities of ground truth depth data, which is rare and often intractable to obtain. Others attempt to estimate disparity as an intermediary step using a secondary supervisory signal, leading to blurring and other undesirable artefacts. In this paper, we propose a monocular depth estimation approach, which employs a jointly-trained pixel-wise semantic understanding step to estimate depth for individually-selected groups of objects (segments) within the scene. The separate depth outputs are efficiently fused to generate the final result. This creates simpler learning objectives for the jointly-trained individual networks, leading to more accurate overall depth. Extensive experimentation demonstrates the efficacy of the proposed approach compared to contemporary state-of-the-art techniques within the literature.
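
    A minimal sketch of the fusion step described above, assuming one depth estimate and one boolean mask per segment class; the hard-mask compositing and all names are illustrative assumptions.

        import numpy as np

        def fuse_segment_depths(depth_maps, masks):
            """depth_maps: list of (H, W) per-segment depth estimates.
            masks: list of disjoint boolean (H, W) masks, one per segment."""
            fused = np.zeros_like(depth_maps[0])
            for d, m in zip(depth_maps, masks):
                fused[m] = d[m]   # each pixel takes depth from its own segment
            return fused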